Entry Name: Purdue-Tang-MC2

VAST Challenge 2015
Mini-Challenge 2

 

 

Team Members:

 

Hui Tang, Purdue University, tang227@purdue.edu, Primary

 

Chao Pan, Purdue University, panc@purdue.edu

 

Bing Yu, Purdue University, yu245@purdue.edu

 

Weidan Du, Purdue University, du97@purdue.edu

Shuang Wei, Purdue University, wei93@purdue.edu

Mingran Li, Purdue University, li1940@purdue.edu

Chen Guo, Purdue University, guo171@purdue.edu

Longjie Cheng, Purdue University, cheng70@purdue.edu

Kai Hu, Purdue University, hu332@purdue.edu

Rongrong Zhang, Purdue University, zhan1602@purdue.edu

XinZhe Li, HIT University

Dr. Yingjie (Victor) Chen, Computer Graphics Technology, Purdue University, victorchen@purdue.edu (supervising faculty)
Dr. Zhenyu (Cheryl) Qian, Interaction Design, Purdue University, qianz@purdue.edu (supervising faculty)
Dr. Yu (Michael) Zhu, Statistics, Purdue University, yuzhu@stat.purdue.edu (supervising faculty)

Student Team: YES

 

Did you use data from both mini-challenges? No

 

Analytic Tools Used:

CanvasJS, Gephi

 

Approximately how many hours were spent working on this submission in total?

450hr

 

May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2015 is complete?

Yes

 

 

Video Download

Video:

https://va.tech.purdue.edu/vast2015/MC2Video/MC2Video.wmv

 

 

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Questions

 

MC2.1 Identify those IDs that stand out for their large volumes of communication. For each of these IDs

 

      a.      Characterize the communication patterns you see.

      b.      Based on these patterns, what do you hypothesize about these IDs?

 

Limit your response to no more than 4 images and 300 words.

 

 

a.

ID 1278894 has the largest communication volume of all IDs (189,894 messages sent in total). On all three days it sends messages periodically from Entry Corridor. As Figure 1 shows, sending always starts after 11:00 AM. There are five sending periods each day, separated by one-hour breaks. Within each period, ID 1278894 sends almost the same number of messages every 5 minutes and also receives a few messages.

However, the coverage of the sent messages varies. The first sending period of each day, just after 11:00 AM, has the largest coverage; the coverage in the other periods is much smaller.

ID 839736 is also a large source of messages (60,812 in total). Unlike ID 1278894, ID 839736 sends and receives messages continuously on Friday and Saturday in Entry Corridor. As Figure 2 shows, the number of messages it sends each minute is almost the same as the number it receives, and neither changes much. The pattern is almost the same on Sunday, except for a huge burst of sending and receiving at 12:00 and a smaller one at 14:40.
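
The volume and periodicity checks described above could be reproduced with a short script. The following is a minimal sketch, not the tooling used for this submission: it assumes each record is a (timestamp, fromid, toid, location) tuple, and the sample rows and recipient ids are made up for illustration.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample rows: (timestamp, fromid, toid, location).
# The values are illustrative and are not taken from the challenge data.
rows = [
    ("2014-06-06 12:00:10", "1278894", "100", "Entry Corridor"),
    ("2014-06-06 12:01:30", "1278894", "101", "Entry Corridor"),
    ("2014-06-06 12:06:00", "1278894", "102", "Entry Corridor"),
    ("2014-06-06 12:02:00", "100", "839736", "Wet Land"),
]


def sent_volume(rows):
    """Count the messages sent by each id."""
    return Counter(fromid for _, fromid, _, _ in rows)


def five_minute_bins(rows, sender):
    """Count the messages sent by `sender` in each 5-minute bin."""
    bins = Counter()
    for ts, fromid, _, _ in rows:
        if fromid != sender:
            continue
        t = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        # Round the timestamp down to the start of its 5-minute bin.
        start = t.replace(minute=t.minute - t.minute % 5, second=0)
        bins[start] += 1
    return bins


volume = sent_volume(rows)
top_id, top_count = volume.most_common(1)[0]
bins = five_minute_bins(rows, top_id)
```

A sender whose bins hold nearly equal counts within a sending period, as reported for ID 1278894, would show up as a flat profile in `bins`.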

 

b.

ID 1278894 might be a broadcast center related to activities or shows in the park, because it has a fixed location and periodically sends a large number of messages.

ID 839736 might be an information service that answers questions, which would explain why it sends and receives almost the same number of messages. The increase in communication from 12:00 to 14:40 on Sunday might be due to inquiries about an accident.

Figure 0. Overall Structure

Figure 1.

 

Figure 2.

 

Figure 3.

 

 

MC2.2 Describe up to 10 communications patterns in the data. Characterize who is communicating, with whom, when and where. If you have more than 10 patterns to report, please prioritize those patterns that are most likely to relate to the crime.

 

Limit your response to no more than 10 images and 1000 words.

 

1. Some ids sent only one message over the three-day period and never received any. For example, id 215220 sent its only message on Sunday at 9:38:58 from location (80, 74) in Kiddie Land. The recipient is an external entity. The following shows the time series of its sending and receiving activity.

 

2. Although every id sent at least one message, 14 ids sent only one message over the three days. The following table summarizes the time and location of communication for those ids.

 

time             fromid   toid    location     fromx  fromy  tox  toy
6/6/14 9:35:57   1763672  0       Kiddie Land  81     77     0    0
6/6/14 14:47:15  1336870  0       Wet Land     22     34     0    0
6/7/14 9:32:47   596672   0       Tundra Land  41     75     0    0
6/7/14 10:38:23  1458915  0       Wet Land     32     33     0    0
6/7/14 11:27:30  1680161  0       Kiddie Land  87     48     0    0
6/8/14 9:38:58   215220   0       Kiddie Land  80     74     0    0
6/8/14 11:19:35  365259   0       Wet Land     69     44     0    0
6/8/14 11:52:54  474843   0       Wet Land     69     44     0    0
6/8/14 12:26:03  688489   0       Wet Land     63     43     0    0
6/7/14 8:51:15   1187304  839736  Wet Land     56     31     0    0
6/7/14 13:41:21  1038617  839736  Tundra Land  35     65     0    0
6/8/14 9:38:29   825934   839736  Tundra Land  49     83     0    0
6/8/14 10:16:51  1658667  839736  Wet Land     16     49     0    0
6/8/14 13:07:38  906235   839736  Tundra Land  16     66     0    0

The table above shows that every id that sent only one message sent it either to an external entity (toid 0) or to the high-volume id 839736 described in MC2.1.
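
A filter of this kind is easy to express in code. The sketch below is an illustration under assumptions, not the procedure used for this submission: it takes (fromid, toid) pairs, here a few rows excerpted from the tables in this section, and returns each id that sent exactly one message together with its recipient and whether the id ever received a message.

```python
from collections import Counter

# (fromid, toid) pairs excerpted from the tables in this section.
messages = [
    ("215220", "0"),
    ("1187304", "839736"),
    ("897528", "1240560"),
    ("897528", "509717"),
]


def single_message_senders(messages):
    """Map each id that sent exactly one message to its recipient."""
    sent = Counter(fromid for fromid, _ in messages)
    received = Counter(toid for _, toid in messages)
    result = {}
    for fromid, toid in messages:
        if sent[fromid] == 1:
            result[fromid] = {
                "target": toid,
                "never_received": received[fromid] == 0,
            }
    return result


singles = single_message_senders(messages)
```

On the full data set, the `never_received` flag would separate the ids of pattern 1 from the other single-message senders.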

3. There are hub groups in which a single center id communicated with every other member while the other members did not communicate with one another. The following graph shows an example that occurred on Sunday at 11:25:49.

 

The communication detail is shown in the following table:

time             fromid  toid     location  fromx  fromy  tox  toy
6/8/14 11:25:49  897528  1240560  Wet Land  62     42     42   20
6/8/14 11:25:49  897528  509717   Wet Land  62     42     87   68
6/8/14 11:25:49  897528  1078759  Wet Land  62     42     42   20
6/8/14 11:25:49  897528  1861415  Wet Land  62     42     62   41
6/8/14 11:25:49  897528  1445101  Wet Land  62     42     42   20
6/8/14 11:25:49  897528  457576   Wet Land  62     42     62   43
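
The hub structure of pattern 3 can be tested mechanically: a group is a single-center hub when exactly one id is an endpoint of every message, which also rules out any message between two non-center ids. The following is a minimal sketch of that test, using the (fromid, toid) pairs from the table above; the function name is our own and the extra "broken" edge is hypothetical.

```python
def find_star_center(edges):
    """Return the single id that is an endpoint of every edge, or None."""
    if not edges:
        return None
    candidates = set(edges[0])
    for a, b in edges[1:]:
        # Keep only the ids that also appear in this edge.
        candidates &= {a, b}
    if len(candidates) == 1:
        return next(iter(candidates))
    return None


# (fromid, toid) pairs from the table above.
hub_edges = [
    ("897528", "1240560"), ("897528", "509717"), ("897528", "1078759"),
    ("897528", "1861415"), ("897528", "1445101"), ("897528", "457576"),
]
center = find_star_center(hub_edges)

# A hypothetical message between two non-center ids breaks the hub.
broken = find_star_center(hub_edges + [("1240560", "509717")])
```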

4. There are hubs with two centers, in which each center id sent messages to the rest of the group while the non-center ids did not communicate with each other. An example is shown in the following graph:

The communication detail is shown in the following table:

time                 fromid   toid     location  fromx  fromy  tox  toy
2014-06-08 11:26:10  51523    648000   Wet Land  17     43     17   43
2014-06-08 11:26:10  51523    1606701  Wet Land  17     43     17   43
2014-06-08 11:26:10  51523    730487   Wet Land  17     43     16   66
2014-06-08 11:26:10  51523    1592785  Wet Land  17     43     16   66
2014-06-08 11:26:10  51523    1095102  Wet Land  17     43     16   66
2014-06-08 11:26:10  51523    1729991  Wet Land  17     43     17   43
2014-06-08 11:26:10  51523    1753569  Wet Land  17     43     16   66
2014-06-08 11:26:37  1729991  51523    Wet Land  17     43     17   43
2014-06-08 11:26:37  1729991  648000   Wet Land  17     43     17   43
2014-06-08 11:26:37  1729991  1606701  Wet Land  17     43     17   43
2014-06-08 11:26:37  1729991  730487   Wet Land  17     43     16   66
2014-06-08 11:26:37  1729991  1592785  Wet Land  17     43     16   66
2014-06-08 11:26:37  1729991  1095102  Wet Land  17     43     16   66
2014-06-08 11:26:37  1729991  1753569  Wet Land  17     43     16   66

The data show that the two center ids sent their messages at different times (11:26:10 and 11:26:37). All of the communications occurred in Wet Land, and both center ids were at the same location, (17, 43), while sending. Ids 648000 and 1606701 were at the centers' location during this period. The other non-center ids were away from the centers but together with one another at (16, 66).
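
The co-location reading of this table can be checked by grouping ids by their recorded coordinates. The sketch below is an illustration only; the positions are taken from the fromx/fromy and tox/toy columns of the table above, and the function name is our own.

```python
from collections import defaultdict

# (id, x, y) positions read from the table above.
positions = [
    ("51523", 17, 43), ("1729991", 17, 43),
    ("648000", 17, 43), ("1606701", 17, 43),
    ("730487", 16, 66), ("1592785", 16, 66),
    ("1095102", 16, 66), ("1753569", 16, 66),
]


def group_by_location(positions):
    """Group ids by the (x, y) cell in which they were recorded."""
    groups = defaultdict(set)
    for id_, x, y in positions:
        groups[(x, y)].add(id_)
    return dict(groups)


groups = group_by_location(positions)
```

Two cells result: one holding both centers with ids 648000 and 1606701, and one holding the remaining four ids.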

5. There are some cases where two hubs are connected through a middleman. The following graph shows an example:

 

 

where id 1337155 and id 966510 are the two hub centers and id 0 is a pseudo id representing an external entity. They are connected through id 254060: the two center ids sent messages to the middleman, which in turn sent a message to id 0.

The following table shows the communication details:

time                 fromid   toid    location     fromx  fromy  tox  toy
2014-06-08 11:25:23  1337155  254060  Tundra Land  34     65     33   65
2014-06-08 11:26:22  966510   254060  Tundra Land  36     72     36   72
2014-06-08 11:26:30  254060   0       Tundra Land  37     73     0    0

The communication occurred in Tundra Land. Id 1337155 contacted the middleman, id 254060, before id 966510 did. Center id 966510 and the middleman were close to each other, while the other center, id 1337155, was farther away.

6. There are large multi-center hubs which are connected to other groups through multiple middlemen.

Ids 1215994, 675346, 376904, 829943, 162071, 1460660, 1952914, 170456, and 1034802 are the major communicators in this group. The group has two types of members: one type communicates only with the listed ids, and the other communicates with the listed ids and with other groups as well. The former are shown as the dots enclosed in the middle, and the latter as the dots aligned along the bottom.

The following table shows the time when each of those listed ids sent messages. Each of them sent to multiple ids:

Time                 From     Location
2014-06-08 11:25:08  1215994  Coaster Alley
2014-06-08 11:25:15  170456   Entry Corridor
2014-06-08 11:25:17  1034802  Coaster Alley
2014-06-08 11:25:24  1952914  Wet Land
2014-06-08 11:26:50  1620771  Coaster Alley
2014-06-08 11:26:53  829943   Tundra Land
2014-06-08 11:26:54  675346   Wet Land
2014-06-08 11:26:54  1460660  Coaster Alley
2014-06-08 11:26:39  376904   Coaster Alley

 

Notably, each id stayed within a single land while sending its messages. We suspect that they are the local receptionists of each land.

7. In some instances, a member of a large group “leaks” messages to an individual who has no connection with the rest of the group. The following is an example:

Id 1022772 belongs to a large group in which heavy communication occurred on Sunday between 12:25:00 and 12:26:59. However, id 584017 has no communication other than with id 1022772.

time                 fromid   toid    location       fromx  fromy  tox  toy
2014-06-08 11:24:08  1366526  584017  Coaster Alley  49     28     38   90
2014-06-08 11:24:28  1366526  584017  Coaster Alley  52     28     38   90
2014-06-08 11:26:22  1022772  584017  Coaster Alley  44     25     38   90
2014-06-08 11:27:40  1859126  584017  Wet Land       17     43     38   90
2014-06-08 11:28:09  1063010  584017  Wet Land       69     44     38   90
2014-06-08 11:28:10  1394108  584017  Wet Land       69     44     38   90
2014-06-08 11:28:16  715390   584017  Wet Land       15     40     38   90
2014-06-08 11:28:18  18989    584017  Wet Land       17     43     38   90
2014-06-08 11:28:27  640904   584017  Kiddie Land    71     81     38   90
2014-06-08 11:28:28  640904   584017  Kiddie Land    71     81     38   90
2014-06-08 11:28:30  967171   584017  Wet Land       17     43     38   90

The table above shows that id 1022772 sent a message to id 584017 on Sunday at 11:26:22 from Coaster Alley.

8. On some occasions the broadcasting sites sent messages en masse. The following is an example in which id 1278894 sent a massive number of messages on Sunday at 12:30:00 and 12:30:01.

 

 

MC2.3 From this data, can you hypothesize when the crime was discovered? Describe your rationale.

 

Limit your response to no more than 3 images and 300 words. 

 

From this data, we hypothesize that the crime was discovered at 12:00 on Sunday.
First, because we consider ID 839736 to be an information service center (see our hypothesis in MC2.1), its communication pattern reflects the overall situation in the park to some extent. As Figure 1 shows, there is a burst of message sending and receiving at 12:00 on Sunday that does not appear on Friday or Saturday. We think that many people were asking for information at that time, so this unusual behavior can indicate when the crime was discovered.
Second, for other IDs the pattern at 12:00 also differs between Sunday and Saturday, which supports our hypothesis. For example, comparing Figure 2 with Figure 3, we can see that most people in that group sent many more messages during 12:00 - 12:30 on Sunday than during other periods on Sunday or during the whole of Saturday.
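
A burst of this kind can also be flagged numerically rather than by eye. The sketch below is one simple way to do so and is not the method used for this submission: it marks every time bin whose message count exceeds a multiple of the median count. The threshold factor of 3 and the per-minute counts are assumptions chosen for illustration, not values from the challenge data.

```python
from statistics import median


def find_bursts(counts, factor=3.0):
    """Return the bins whose count exceeds `factor` times the median count."""
    baseline = median(counts.values())
    return [key for key, count in counts.items() if count > factor * baseline]


# Hypothetical per-minute message counts for one id on one day.
counts = {
    "11:58": 40,
    "11:59": 42,
    "12:00": 300,
    "12:01": 41,
    "14:40": 150,
}
bursts = find_bursts(counts)
```

With these made-up counts the median is 42, so the bins at 12:00 and 14:40 are flagged while the ordinary minutes are not.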

 

Figure 1. ID 839736, Sun(162755)

 

Figure 2. Sat(403)

 

Figure 3. Sun(403)